influenza activity
Flusion: Integrating multiple data sources for accurate influenza predictions
Ray, Evan L., Wang, Yijin, Wolfinger, Russell D., Reich, Nicholas G.
Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admissions reported in the CDC's National Healthcare Safety Network (NHSN) surveillance system. Reporting of influenza hospital admissions through NHSN began within the last few years, and as such only a limited amount of historical data are available for this signal. To produce forecasts in the presence of limited data for the target surveillance system, we augmented these data with two signals that have a longer historical record: 1) ILI+, which estimates the proportion of outpatient doctor visits where the patient has influenza; and 2) rates of laboratory-confirmed influenza hospitalizations at a selected set of healthcare facilities. Our model, Flusion, is an ensemble that combines gradient boosting quantile regression models with a Bayesian autoregressive model. The gradient boosting models were trained on all three data signals, while the autoregressive model was trained on only the target signal; all models were trained jointly on data for multiple locations. Flusion was the top-performing model in the CDC's influenza prediction challenge for the 2023/24 season. In this article we investigate the factors contributing to Flusion's success, and we find that its strong performance was primarily driven by the use of a gradient boosting model that was trained jointly on data from multiple surveillance signals and locations. These results indicate the value of sharing information across locations and surveillance signals, especially when doing so adds to the pool of available training data.
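As a rough illustration of the ensemble of quantile forecasts described above, the sketch below averages the predicted quantiles of two component models and scores the result with the quantile (pinball) loss. The equal-weight averaging rule, the function names, and all numbers are assumptions made for illustration; the abstract does not say how Flusion combines its components or how it was scored.

```python
from statistics import mean


def pinball_loss(y, q_pred, tau):
    """Quantile (pinball) loss for one observation at quantile level tau."""
    diff = y - q_pred
    # Under-prediction is penalized by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def ensemble_quantiles(member_forecasts):
    """Average each quantile level across members (equal weights ASSUMED)."""
    levels = sorted(member_forecasts[0])
    return {tau: mean(m[tau] for m in member_forecasts) for tau in levels}


# Hypothetical weekly hospital-admission forecasts for one location,
# given as {quantile level: predicted admissions}.
gradient_boosting = {0.1: 80.0, 0.5: 120.0, 0.9: 200.0}
autoregressive = {0.1: 60.0, 0.5: 100.0, 0.9: 160.0}

combined = ensemble_quantiles([gradient_boosting, autoregressive])

# Score the combined forecast against a hypothetical observed value.
observed = 130.0
score = mean(pinball_loss(observed, combined[tau], tau) for tau in combined)
print(combined, score)
```

Averaging quantile by quantile keeps the combined forecast in the same format as its members, so it can be scored level by level with the same loss a quantile regression model is trained on.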
Influenza Modeling Based on Massive Feature Engineering and International Flow Deconvolution
Liu, Ziming, Wang, Yixuan, Han, Zizhao, Wu, Dian
In this article, we analyze the potential factors driving the spread of influenza and possible policies to mitigate the adverse effects of the disease. We first apply the discrete Fourier transform (DFT) to establish that regional influenza activity has a yearly periodic structure, which lets us restrict the analysis to yearly influenza behavior. We then collect a large set of region-wise indicators that may contribute to influenza mortality, such as consumption, immunization, sanitation, water quality, and other indicators from external data, with $1170$ dimensions in total. We extract significant features from these high-dimensional indicators using a combination of data analysis techniques, including matrix completion, support vector machines (SVM), autoencoders, and principal component analysis (PCA). Furthermore, we model the international flow of migration and trade as a convolution on regional influenza activity, and solve the deconvolution problem as higher-order perturbations to the linear regression, thus separating the regional and international factors related to influenza mortality. Finally, both the original model and the perturbed model are tested on regional examples as validation. On the policy side, we use the connectivity data together with the previously extracted significant features to propose measures that alleviate the impact of influenza and that can be propagated and carried out efficiently. We conclude that environmental and economic features are significant for influenza mortality. The model can be easily adapted to other types of infectious diseases.
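The DFT step can be sketched as follows: compute the magnitude spectrum of a weekly activity series and read off the period of the strongest non-constant frequency. The series here is synthetic, and the function names and the weekly sampling are assumptions for illustration, not the authors' data or code.

```python
import cmath
import math


def dft_magnitudes(series):
    """Magnitudes of the DFT coefficients for frequencies 0..n/2."""
    n = len(series)
    avg = sum(series) / n
    centered = [x - avg for x in series]  # remove the constant component
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        mags.append(abs(coeff))
    return mags


def dominant_period(series):
    """Period (in samples) of the strongest non-zero frequency."""
    mags = dft_magnitudes(series)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return len(series) / k


# Synthetic weekly series over four years: a strong yearly cycle
# (52 weeks) plus a weaker half-yearly harmonic.
weeks = 52 * 4
series = [
    10 + 5 * math.cos(2 * math.pi * t / 52) + 1.5 * math.sin(2 * math.pi * t / 26)
    for t in range(weeks)
]

period = dominant_period(series)
print(period)
```

This direct O(n^2) evaluation is enough for a few hundred weekly samples; longer series would normally use an FFT routine instead.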
AI combined with EHR and other data improves influenza forecasting
With influenza cases elevated nationally and widespread throughout the country, researchers led by Boston Children's Hospital contend that machine learning can produce highly accurate local flu surveillance. In fact, they say that combining two forecasting methods with artificial intelligence produces the most accurate estimates of flu activity available to date, a week ahead of traditional healthcare-based reports, at the state level across the United States. While the Centers for Disease Control and Prevention monitors influenza-like illnesses (ILI) in the U.S. by gathering information from physicians' reports about patients with ILI seeking medical attention, the data become available with a lag of as much as two weeks. However, in a study published on Friday in Nature Communications, researchers say they have successfully combined Google search frequencies and electronic health record data with spatio-temporal trends in influenza activity to produce forecasts with higher correlation and lower errors than all other tested models for current ILI activity at the state level. "We believe that the accuracy of our method involves a balance between responsiveness and robustness," state the authors.
Cloud-based Electronic Health Records for Real-time, Region-specific Influenza Surveillance
Santillana, Mauricio, Nguyen, Andre, Louie, Tamara, Zink, Anna, Gray, Josh, Sung, Iyue, Brownstein, John S.
Introduction
Influenza is a leading cause of death in the United States (US), where up to 50,000 are killed each year by influenza-like illnesses (ILI) [1]. Therefore, monitoring, early detection, and prediction of influenza outbreaks are crucial to public health. Disease detection and surveillance systems provide epidemiologic intelligence that allows health officials to deploy preventive measures and help clinic and hospital administrators make optimal staffing and stocking decisions [2]. The US Centers for Disease Control and Prevention (CDC) monitors ILI in the US by gathering information from physicians' reports about patients with ILI seeking medical attention [3]. CDC's ILI data provides useful estimates of influenza activity; however, its availability has a known time lag of one to two weeks. This time lag is far from optimal since public health decisions need to be made based on information that is two weeks old. Systems capable of providing real-time estimates of influenza activity are, thus, critical. Many attempts have been made to design methods capable of providing real-time estimates of ILI activity in the US by leveraging Internet-based data sources that could potentially measure ILI in an indirect manner [4, 5, 6, 7, 8, 9, 10, 11].